Okay, so welcome everybody. Good morning. Welcome back to deep learning. Today we want
to discuss unsupervised deep learning. So let's have a look at this topic. Well, the
motivation is that so far we've been working mostly in the domain of supervised learning,
and there we had huge data set sizes of up to millions of annotated images. We had many objects but
few modalities. But there are also settings with very different characteristics. One example is medical
imaging. There we typically have small data set sizes: in many cases you only have access to,
let's say, 30 or 100 patients. But you have one complex object, the patient, and you
have many different modalities; you use all kinds of different modalities to diagnose
different kinds of diseases. So we have somewhat different ways of looking at the data,
and at the same time quite a lot of data is actually being acquired. For example,
in Germany we have 65 CT scans per 1,000 inhabitants, which means that in 2014 alone we had 5 million
CT scans. But the problem with this data is that it's highly sensitive. From a CT data set
you can easily identify the person; it's especially easy from the head. There was a recent
study showing that if you have a volumetric data set, be it MR or CT, you can of course
reconstruct the surface, that is, the face, from it. And once you are able to reconstruct
the surface, you can render it so that it appears like a photograph, and you can
use that to match a CT data set against a photograph. Conversely, from a photograph, using advanced
image processing methods, partially based on deep learning, you can extract an estimate
of the surface, and then you can match the two surfaces. So it's possible to identify
persons from such data. The data is also sensitive because it contains personal information, probably also
information about pathology: if you have the volumetric data set you can diagnose it, you
can see what kind of pathologies this patient has, and therefore it's very sensitive. There
is a trend to make the data available. You may know that the German Ministry, for example,
is thinking about making lots of data accessible to researchers. But even if they
make this data available, even if you make a lot of data available, you still need appropriate
annotation; in medical imaging in particular, what you often need are outline segmentation
masks and things like that. And that's not so easy to get: in particular when it's sensitive
data, you would have to get somebody to annotate all of this data, and in medical practice
it's typically not annotated in the way you would like to have it for machine
learning methods. So the big problem is not just that you need access to loads
of data, but that you also need access to very good labels for this data. This makes
the whole issue very complex, and simple suggestions won't really
bring us to the point where we can use the data. So there are some solutions, for example
weakly supervised learning, which is something we will also discuss in a later lecture.
There you have a weak label; for example, the image is labeled "brushing teeth", and you
then try to localize the decisive region within the image and use this as a kind of advanced
labeling procedure. So we will look into how to get segmentation masks from such weakly
supervised training data. This is quite popular because it requires comparatively little effort:
just assigning an image to a class is much less effort than annotating the outline of
the important object in the image. So that's one way you can go. Then there is semi-supervised
learning, where you have only partial or little labeled data, and there is unsupervised
learning, where you have no labeled data at all. And this is exactly what we want to talk about:
we want to see which kinds of methods we can use that don't need labels but can still
make something out of the data. These two are the most relevant for today. Label-free
learning has several applications. For example, you can use it for so-called manifold learning,
or dimensionality reduction. Let's say you have a 3D data set, a point cloud
embedded in 3D space, of which we see one slice here. This kind of object is also
called a Swiss roll, and with appropriate dimensionality reduction methods you can see
that these data actually live on a 2D manifold. So it's possible to reduce the data set from
three dimensions to two, but you want to preserve the topology of the original data set as well as possible.
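The Swiss-roll reduction described here can be sketched in a few lines. This is a minimal illustration using scikit-learn, which is an assumption on my part; the lecture does not name a library, and Isomap is just one of several manifold learning methods (LLE, t-SNE, etc.) that would unroll this data set.

```python
# Sketch: unrolling the Swiss roll from 3D to its 2D manifold.
# Assumes scikit-learn; Isomap is one possible choice of method.
from sklearn.datasets import make_swiss_roll
from sklearn.manifold import Isomap

# Sample a 3D point cloud that actually lives on a 2D manifold.
X, t = make_swiss_roll(n_samples=1000, noise=0.05, random_state=0)
print(X.shape)  # (1000, 3)

# Isomap approximates geodesic distances along the manifold, so the
# 2D embedding preserves the roll's topology rather than collapsing it.
X_2d = Isomap(n_neighbors=10, n_components=2).fit_transform(X)
print(X_2d.shape)  # (1000, 2)
```

The key point is the one from the lecture: a linear projection (e.g. PCA) would squash the windings of the roll onto each other, while a manifold method keeps neighboring points on the roll neighboring in the 2D embedding.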
(Recording metadata: open access; duration 01:17:39 min; recorded 2020-01-14; uploaded 2020-01-14 17:39:03; language en-US)